WHAT IS CLAIMED IS: 



1 1. A computer-implemented method for generating superunits from a

2 concept network, the concept network including a plurality of units and a plurality of 

3 relationships defined between pairs of the plurality of units, wherein each relationship has an 

4 associated edge weight, the method comprising the acts of: 

5 identifying a superunit seed comprising at least one member unit, wherein 

6 each member unit is one of the plurality of units of the concept network; 

7 defining a signature for the superunit seed, the signature including one or more 

8 signature units, wherein each signature unit has a relationship in the concept network with at 

9 least a minimum number of the member units; 

10 expanding the superunit seed by adding one or more new member units from

11 the concept network, wherein each new member unit satisfies a match criterion based on the

12 signature; 

13 modifying the signature based on the expanded superunit seed; 

14 repeating the acts of expanding and modifying until a convergence criterion is 

15 satisfied, wherein a final superunit and a final signature are formed once the convergence 

16 criterion is satisfied; and 

17 storing superunit membership information for each member unit of the final 

18 superunit. 
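
By way of illustration only, the acts of claim 1 can be sketched as a short loop. The sketch is not the claimed implementation: the dictionary representation of the concept network, the function names, the exclusion of member units from the signature, the count-based match criterion, and the `max_rounds` safety limit are all assumptions made for the example, and the storing act is reduced to returning the result.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: unit -> {related unit: edge weight}.
ConceptNetwork = Dict[str, Dict[str, float]]


def define_signature(net: ConceptNetwork, members: Set[str], min_members: int) -> Set[str]:
    """Units related to at least `min_members` member units (members excluded by assumption)."""
    counts: Dict[str, int] = {}
    for m in members:
        for neighbor in net.get(m, {}):
            counts[neighbor] = counts.get(neighbor, 0) + 1
    return {u for u, c in counts.items() if c >= min_members and u not in members}


def expand(net: ConceptNetwork, members: Set[str], signature: Set[str], threshold: int) -> Set[str]:
    """Add every unit related to at least `threshold` signature units (assumed match criterion)."""
    added = {
        u for u in net
        if u not in members and u not in signature
        and sum(1 for s in signature if s in net[u]) >= threshold
    }
    return members | added


def build_superunit(net: ConceptNetwork, seed: Set[str], min_members: int = 2,
                    threshold: int = 2, max_change: int = 0,
                    max_rounds: int = 20) -> Tuple[Set[str], Set[str]]:
    members = set(seed)
    signature = define_signature(net, members, min_members)
    for _ in range(max_rounds):
        new_members = expand(net, members, signature, threshold)
        new_signature = define_signature(net, new_members, min_members)
        change = len(new_members ^ members)  # units added in this round
        members, signature = new_members, new_signature
        if change <= max_change:  # assumed convergence criterion
            break
    return members, signature


net = {
    "new york": {"hotel": 1.0, "restaurant": 1.0, "map": 1.0},
    "boston": {"hotel": 1.0, "restaurant": 1.0, "map": 1.0},
    "chicago": {"hotel": 1.0, "restaurant": 1.0},
    "hotel": {"new york": 1.0, "boston": 1.0, "chicago": 1.0},
    "restaurant": {"new york": 1.0, "boston": 1.0, "chicago": 1.0, "pizza": 1.0},
    "map": {"new york": 1.0, "boston": 1.0},
    "pizza": {"restaurant": 1.0},
}
final_members, final_signature = build_superunit(net, {"new york", "boston"})
```

In this toy network the seed of two cities picks up "chicago" in the first round, and the second round adds nothing, so the loop stops.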

1 2. The method of claim 1, wherein the concept network is generated from 

2 a set of previous search queries. 

1 3. The method of claim 1, wherein the act of storing the superunit 

2 membership information includes the acts of: 

3 computing a membership weight for each member unit of the final superunit, 

4 wherein the membership weight is based on the relationships in the concept network between 

5 the member unit and the signature units of the final signature, 

6 wherein the stored superunit membership information includes the

7 membership weight. 
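
As a hedged illustration of claim 3, the sketch below computes one possible membership weight. The claim only requires that the weight be based on the relationships between the member unit and the signature units; taking the fraction of signature units to which the member is related is an assumption of this example, as are the function names and the record layout.

```python
from typing import Dict, Set

ConceptNetwork = Dict[str, Dict[str, float]]  # hypothetical: unit -> {related unit: edge weight}


def membership_weight(net: ConceptNetwork, member: str, signature: Set[str]) -> float:
    """Assumed weight: share of signature units the member unit is related to."""
    if not signature:
        return 0.0
    related = sum(1 for s in signature if s in net.get(member, {}))
    return related / len(signature)


def membership_records(net: ConceptNetwork, superunit: str,
                       members: Set[str], signature: Set[str]) -> Dict[str, dict]:
    """Build the per-unit information that would be stored (layout is illustrative)."""
    return {
        m: {"superunit": superunit, "weight": membership_weight(net, m, signature)}
        for m in members
    }


net = {
    "boston": {"hotel": 0.5, "map": 0.2},
    "chicago": {"hotel": 0.4},
}
records = membership_records(net, "cities", {"boston", "chicago"}, {"hotel", "map"})
```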

1 4. The method of claim 1, further comprising the act of generating the 

2 concept network from the previous queries. 

1 5. The method of claim 1, further comprising the acts of:

2 subsequently to the act of modifying the signature and prior to the step of

3 repeating, purging the superunit seed by removing a member unit that does not satisfy the 

4 match criterion based on the modified signature, 

5 wherein the act of purging is repeated subsequently to repeating the step of 

6 modifying. 
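
The purging act of claim 5 might look like the following minimal sketch. It assumes, purely for illustration, that the match criterion is "related to at least `threshold` signature units"; the claim itself leaves the criterion open, and the names used here are hypothetical.

```python
from typing import Dict, Set

ConceptNetwork = Dict[str, Dict[str, float]]  # hypothetical: unit -> {related unit: edge weight}


def purge(net: ConceptNetwork, members: Set[str], signature: Set[str], threshold: int) -> Set[str]:
    """Keep only members that still satisfy the (assumed) match criterion
    against the modified signature."""
    return {
        m for m in members
        if sum(1 for s in signature if s in net.get(m, {})) >= threshold
    }


net = {
    "boston": {"hotel": 1.0, "map": 1.0},
    "chicago": {"hotel": 1.0, "map": 1.0},
    "pizza": {"hotel": 1.0},
}
kept = purge(net, {"boston", "chicago", "pizza"}, {"hotel", "map"}, threshold=2)
```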

1 6. The method of claim 1, wherein the convergence criterion is satisfied 

2 if, as a result of the act of repeating, membership of the superunit seed changes by no more 

3 than a maximum number of units. 

1 7. The method of claim 1, wherein the convergence criterion is satisfied 

2 if, as a result of the act of repeating, membership of the signature changes by no more than a 

3 maximum number of units.
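
The convergence tests of claims 6 and 7 can be illustrated with a single helper applied either to the member units or to the signature units. Counting a change as the number of units added plus the number removed (the symmetric difference) is an assumption of this sketch.

```python
from typing import Set


def has_converged(previous: Set[str], current: Set[str], max_changed: int) -> bool:
    """True if membership changed by no more than `max_changed` units.
    The change is measured here as the size of the symmetric difference."""
    return len(previous ^ current) <= max_changed


small_change = has_converged({"a", "b"}, {"a", "b", "c"}, max_changed=1)
large_change = has_converged({"a"}, {"b", "c"}, max_changed=1)
```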

1 8. The method of claim 1, wherein the act of identifying the superunit 

2 seed includes the act of forming a cluster of two or more units as the superunit seed, wherein 

3 each unit in the cluster has at least one neighbor unit in common with a base unit of the 

4 cluster. 

1 9. The method of claim 8, wherein the act of forming the cluster includes 

2 the acts of: 

3 selecting a base unit and a candidate unit from the concept network; 

4 identifying a plurality of neighbor units of the base unit, wherein each 

5 neighbor unit has a relationship in the concept network to the base unit; 

6 identifying at least one of the neighbor units as a matched unit, wherein each

7 matched unit has a relationship in the concept network to the candidate unit; 

8 computing a clustering weight for the candidate unit based on the plurality of 

9 neighbor units including the at least one matched unit; and 

10 based on the clustering weight, determining whether to include the candidate 

11 unit in a cluster with the base unit.
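
A possible reading of the cluster-forming acts of claim 9 is sketched below. The claim does not fix a formula for the clustering weight; the ratio used here (edge weight from the base unit to its matched neighbors, divided by edge weight from the base unit to all of its neighbors) is an assumption, as are the function names and the `min_weight` cutoff.

```python
from typing import Dict, Iterable, Set

ConceptNetwork = Dict[str, Dict[str, float]]  # hypothetical: unit -> {related unit: edge weight}


def clustering_weight(net: ConceptNetwork, base: str, candidate: str) -> float:
    """Assumed weight: share of the base unit's neighbor edge weight that
    falls on neighbors also related to the candidate (the matched units)."""
    neighbors = net.get(base, {})
    total = sum(neighbors.values())
    if total == 0:
        return 0.0
    candidate_edges = net.get(candidate, {})
    matched = [n for n in neighbors if n in candidate_edges]
    return sum(neighbors[n] for n in matched) / total


def form_cluster(net: ConceptNetwork, base: str,
                 candidates: Iterable[str], min_weight: float) -> Set[str]:
    """Include each candidate whose clustering weight reaches `min_weight`."""
    return {base} | {
        c for c in candidates
        if c != base and clustering_weight(net, base, c) >= min_weight
    }


net = {
    "boston": {"hotel": 2.0, "map": 1.0, "red sox": 1.0},
    "chicago": {"hotel": 1.0, "map": 1.0},
    "pizza": {"recipe": 1.0},
}
cluster = form_cluster(net, "boston", ["chicago", "pizza"], min_weight=0.5)
```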

1 10. The method of claim 1, wherein the act of identifying the superunit

2 seed includes forming a clique of two or more closely related units.

1 11. The method of claim 1, wherein the act of identifying the superunit

2 seed includes the act of receiving a list of units from an external source, the list of units being 

3 usable as a superunit seed. 

1 12. The method of claim 11, wherein the external source comprises a web

2 page. 

1 13. The method of claim 11, wherein the act of identifying the superunit

2 seed further includes the act of pruning the list of units to remove a unit that is not in the

3 concept network. 

1 14. The method of claim 1, wherein the act of identifying the superunit 

2 seed includes the acts of: 

3 receiving user behavior data related to the previous queries; and 

4 detecting similarities in the user behavior data related to previous queries 

5 containing different units. 

1 15. The method of claim 14, wherein the user behavior data includes click 

2 through information for the previous queries. 

1 16. The method of claim 1, wherein the act of identifying the superunit 

2 seed includes the acts of: 

3 detecting occurrences of units of the concept network in a source document; 

4 and 

5 generating a superunit seed based on the detected occurrences. 

1 17. The method of claim 1, wherein the relationships between the units of 

2 the concept network include one or more of an association relationship, an extension 

3 relationship, and an alternative relationship.

1 18. The method of claim 1, wherein the act of defining the signature 

2 includes the acts of: 

3 identifying as the signature units a plurality of units in the concept network 

4 that have a specified relationship with at least a minimum number of the member units of the

5 associated superunit seed; and 

6 establishing a threshold number,

7 wherein the step of expanding the superunit seed includes:

8 selecting a candidate unit from the concept network; and 

9 adding the candidate unit to the superunit seed in the event that the 

10 candidate unit has the specified relationship with at least the threshold number of the 

11 signature units.

1 19. The method of claim 18, wherein the threshold number is established 

2 relative to a total number of signature units by reference to a predetermined fraction. 
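
To illustrate claims 18 and 19 together, the sketch below derives the threshold number from a predetermined fraction of the signature size and then applies it to a candidate unit. Rounding up and enforcing a minimum threshold of one are assumptions of the example, not requirements of the claims, and the names are hypothetical.

```python
import math
from typing import Dict, Set

ConceptNetwork = Dict[str, Dict[str, float]]  # hypothetical: unit -> {related unit: edge weight}


def threshold_from_fraction(signature_size: int, fraction: float) -> int:
    """Threshold relative to the total number of signature units.
    Rounds up and never drops below 1 (both are assumptions)."""
    return max(1, math.ceil(fraction * signature_size))


def matches(net: ConceptNetwork, candidate: str, signature: Set[str], fraction: float) -> bool:
    """True if the candidate is related to at least the threshold number of signature units."""
    threshold = threshold_from_fraction(len(signature), fraction)
    related = sum(1 for s in signature if s in net.get(candidate, {}))
    return related >= threshold


net = {"chicago": {"hotel": 1.0, "map": 1.0}}
signature = {"hotel", "map", "restaurant", "airport"}
accepted = matches(net, "chicago", signature, fraction=0.5)
rejected = matches(net, "chicago", signature, fraction=0.75)
```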

1 20. The method of claim 1, wherein defining the signature includes the 

2 acts of: 

3 identifying as the signature units a plurality of units in the concept network 

4 that have a specified relationship with at least a minimum number of the member units of the 

5 associated superunit seed; 

6 establishing an edge weight range for each signature unit; and

7 establishing a threshold number, 

8 wherein the step of expanding the superunit seed includes: 

9 selecting a candidate unit from the concept network; 

10 determining a first number equal to a number of the signature units 

11 with which the candidate unit has the specified relationship and has an edge weight

12 within the edge weight range for that signature unit; and 

13 adding the candidate unit to the superunit seed in the event that the first 

14 number is equal to or greater than the threshold number. 
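
The range-based match of claim 20 can be sketched as follows. The per-signature-unit ranges are modeled as inclusive `(low, high)` pairs, which is an assumption; the claim does not say whether the range endpoints are included, and the function names are illustrative.

```python
from typing import Dict, Tuple

ConceptNetwork = Dict[str, Dict[str, float]]  # hypothetical: unit -> {related unit: edge weight}
WeightRanges = Dict[str, Tuple[float, float]]  # signature unit -> (low, high), inclusive by assumption


def count_in_range(net: ConceptNetwork, candidate: str, ranges: WeightRanges) -> int:
    """The 'first number': signature units the candidate is related to
    with an edge weight inside that signature unit's range."""
    edges = net.get(candidate, {})
    return sum(
        1 for unit, (low, high) in ranges.items()
        if unit in edges and low <= edges[unit] <= high
    )


def should_add(net: ConceptNetwork, candidate: str, ranges: WeightRanges, threshold: int) -> bool:
    """Add the candidate when the first number reaches the threshold number."""
    return count_in_range(net, candidate, ranges) >= threshold


net = {"chicago": {"hotel": 0.4, "map": 0.9, "restaurant": 0.3}}
ranges = {"hotel": (0.2, 0.6), "map": (0.1, 0.5), "restaurant": (0.25, 0.5)}
first_number = count_in_range(net, "chicago", ranges)
```

Here "map" is related to the candidate but its edge weight of 0.9 lies outside the range, so it does not count toward the first number.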

1 21. The method of claim 1, further comprising, subsequently to the act of 

2 storing, the acts of: 

3 receiving a current query; 

4 parsing the current query into one or more constituent units; 

5 retrieving the stored superunit membership information for one or more of the 

6 constituent units; and 

7 formulating a response to the current query based at least in part on the 

8 retrieved superunit membership information. 
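
The query-time acts of claim 21 might be exercised as in the sketch below. The greedy longest-phrase parser, the `max_words` limit, and the in-memory `store` dictionary are assumptions made for illustration; the claim does not prescribe how a query is parsed or where the membership information is kept.

```python
from typing import Dict, List, Set


def parse_query(query: str, known_units: Set[str], max_words: int = 3) -> List[str]:
    """Greedy longest-match split of a query into units (assumed parsing strategy).
    Words that match no known unit are kept as single-word units."""
    words = query.lower().split()
    units: List[str] = []
    i = 0
    while i < len(words):
        for n in range(min(max_words, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in known_units or n == 1:
                units.append(phrase)
                i += n
                break
    return units


def lookup_memberships(units: List[str], store: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Retrieve stored superunit membership for the constituent units that have any."""
    return {u: store[u] for u in units if u in store}


store = {"new york": ["cities"]}  # hypothetical stored membership information
units = parse_query("New York hotels", known_units={"new york", "hotels"})
memberships = lookup_memberships(units, store)
```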

1 22. The method of claim 21, wherein the act of formulating the response

2 includes the act of using the superunit membership information to suggest a related search 

3 query. 

1 23. The method of claim 22, wherein the related search query includes a 

2 first unit that is one of the member units of the superunit, wherein the first unit is not a 

3 constituent unit of the current query. 

1 24. The method of claim 22, wherein the related search query includes a 

2 first unit that is one of the signature units of the superunit, wherein the first unit is not a 

3 constituent unit of the current query. 

1 25. The method of claim 21, wherein the act of formulating the response

2 includes the act of using the superunit membership information to suggest a web site for a 

3 sideways search. 

1 26. The method of claim 21, wherein one of the constituent units is a 

2 member of more than one superunit and wherein the act of formulating the response includes 

3 the act of using the superunit membership information to group response data according to 

4 the superunits to which the one of the constituent units belongs. 

1 27. The method of claim 21, wherein the act of formulating the response

2 includes the act of using the superunit information to resolve an ambiguity of a first one of 

3 the constituent units based on comparing another of the constituent units to signature units for 

4 one or more superunits of which the first constituent unit is a member.
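
One way to picture the ambiguity resolution of claim 27 is the sketch below, which picks the superunit whose signature units overlap most with the other constituent units of the query. Scoring by overlap count, returning `None` when nothing overlaps, and the data layout are assumptions of this example.

```python
from typing import Dict, List, Optional, Set


def resolve_ambiguity(ambiguous_unit: str, other_units: List[str],
                      memberships: Dict[str, List[str]],
                      signatures: Dict[str, Set[str]]) -> Optional[str]:
    """Return the superunit of `ambiguous_unit` whose signature shares the most
    units with the rest of the query, or None if no signature overlaps."""
    best: Optional[str] = None
    best_overlap = 0
    others = set(other_units)
    for superunit in memberships.get(ambiguous_unit, []):
        overlap = len(signatures.get(superunit, set()) & others)
        if overlap > best_overlap:
            best, best_overlap = superunit, overlap
    return best


memberships = {"jaguar": ["animals", "cars"]}  # hypothetical stored data
signatures = {"animals": {"zoo", "habitat"}, "cars": {"dealer", "price"}}
sense = resolve_ambiguity("jaguar", ["dealer"], memberships, signatures)
```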

1 28. The method of claim 21, wherein the act of formulating the response 

2 includes the act of using the superunit information to select sponsored content to be 

3 displayed. 

1 29. A system for generating superunits from user search queries, the 

2 system comprising: 

3 a concept network builder module configured to generate a concept network 

4 from a plurality of previous queries, the concept network including a plurality of units and a

5 plurality of relationships defined between pairs of the plurality of units, wherein each

6 relationship has an associated edge weight;

7 a superunit seed module configured to identify a superunit seed comprising at

8 least one member unit, wherein each member unit is one of the plurality of units of the

9 concept network; 

10 a superunit builder module configured to construct superunits and signatures 

11 starting with the superunit seeds, wherein each superunit includes a plurality of member units

12 and wherein each signature is associated with one of the superunits, wherein each signature 

13 includes one or more signature units, wherein each signature unit has a relationship in the 

14 concept network with at least a minimum number of the member units of the associated 

15 superunit; and

16 a storage module configured to store superunit membership information for

17 the member units, wherein the superunit membership information is provided by the

18 superunit builder module.

1 30. The system of claim 29, wherein the superunit builder module is 

2 further configured to define a signature for each superunit seed, to expand the superunit seed

3 by adding one or more new member units from the concept network, wherein each new

4 member unit satisfies a match criterion based on the signature, to modify the signature based 

5 on the expanded superunit seed, and to repeat the steps of expanding and modifying until a 

6 convergence criterion is satisfied, wherein a final superunit and a final signature are formed 

7 once the convergence criterion is satisfied. 

1 31. The system of claim 30, wherein the superunit builder module is 

2 further configured to compute a membership weight for each member unit of the final 

3 superunit, wherein the membership weight is based on the relationships in the concept 

4 network between the member unit and the signature units of the final signature, and to store 

5 the membership weight in the storage module. 

1 32. The system of claim 30, wherein the convergence criterion is satisfied 

2 if, during the repetition, membership of the superunit seed changes by less than a maximum 

3 number of units. 

1 33. The system of claim 30, wherein the convergence criterion is satisfied 

2 if, during the repetition, membership of the signature changes by less than a maximum 

3 number of units.

34. The system of claim 29, wherein the superunit seed module is further
configured to identify a cluster of two or more units as the superunit seed, wherein each unit 
in the cluster has at least one neighbor unit in common. 

35. The system of claim 34, wherein the superunit seed module is further 
configured to select at least two candidate units from the concept network, to identify a 
plurality of neighbor units of the candidate units, wherein each neighbor unit has a 
relationship in the concept network to one or more of the candidate units, to compute a 
clustering weight for the candidate units based on the plurality of neighbor units, and to 
determine, based on the clustering weight, whether to form a cluster from the candidate units. 

36. The system of claim 34, wherein the superunit seed module is further 
configured to receive a list of units from an external source, the list of units being usable as a 
superunit seed. 

37. The system of claim 34, wherein the superunit seed module is further 
configured to receive user behavior data related to the previous queries and to detect 
similarities in the user behavior data related to previous queries containing different units. 

38. The system of claim 34, wherein the superunit seed module is further 
configured to detect occurrences of units of the concept network in a source document and to 
generate a superunit seed based on the detected occurrences. 

39. The system of claim 29, further comprising: 

a query response module coupled to the storage module and configured to 
receive a current query, to parse the current query into one or more constituent units, to 
retrieve from the storage module the superunit membership information for one or more of 
the constituent units, and to formulate a response to the current query based at least in part on 
the retrieved superunit membership information.

40. A computer program product comprising a computer readable medium 
encoded with program code, the program code including: 

program code for identifying a superunit seed comprising at least one member 
unit, wherein each member unit is one of a plurality of units of a concept network, the 
5 concept network including a plurality of units and a plurality of relationships defined between

6 pairs of the plurality of units, wherein each relationship has an associated edge weight; 

7 program code for defining a signature for the superunit seed, the signature 

8 including one or more signature units, wherein each signature unit has a relationship in the 

9 concept network with at least a minimum number of the member units; 

10 program code for expanding the superunit seed by adding one or more new 

11 member units from the concept network, wherein each new member unit satisfies a match

12 criterion based on the signature; 

13 program code for modifying the signature based on the expanded superunit 

14 seed; 

15 program code for repeating the steps of expanding and modifying until a 

16 convergence criterion is satisfied, wherein a final superunit and a final signature are formed 

17 once the convergence criterion is satisfied; and 

18 program code for storing superunit membership information for each member

19 unit of the final superunit. 

1 41. The computer program product of claim 40, wherein the program code

2 further includes:

3 program code for receiving a current query; 

4 program code for parsing the current query into one or more constituent units;

5 program code for retrieving the stored superunit membership information for 

6 one or more of the constituent units; and 

7 program code for formulating a response to the current query based at least in 

8 part on the retrieved superunit membership information. 

1 42. A computer-implemented method for forming a cluster from a concept

2 network, the concept network including a plurality of units and a plurality of relationships 

3 defined between the units, wherein each relationship has an associated edge weight, the

4 method comprising the acts of: 

5 selecting a base unit and a candidate unit from the concept network;

6 identifying a plurality of neighbor units of the base unit, wherein each 

7 neighbor unit has a relationship in the concept network to the base unit; 

8 identifying at least one of the neighbor units as a matched unit, wherein the 

9 matched unit has a relationship in the concept network to the candidate unit; 



10 computing a clustering weight for the candidate unit based on the plurality of 

11 neighbor units including the at least one matched unit; and

12 based on the clustering weight, determining whether to include the candidate 

13 unit in a cluster with the base unit.

1 43. The method of claim 42, further comprising the acts of: 

2 selecting a second candidate unit; and 

3 using the second candidate unit, repeating the acts of identifying at least one of

4 the neighbor units as a matched unit, computing a clustering weight, and determining, thereby 

5 determining whether to include the second candidate unit in the cluster.

1 44. A computer-implemented method for forming a clique from a concept 

2 network, the concept network including a plurality of units and a plurality of relationships 

3 defined between the units, wherein each relationship has an associated edge weight, the 

4 method comprising the acts of: 

5 forming a plurality of clusters, wherein each cluster includes at least a base 

6 unit; 

7 selecting one of the plurality of clusters as a starting cluster; 

8 initializing a clique to include only the base unit of the starting cluster; and 

9 for each member unit u of the starting cluster, adding the member unit u to the 

10 clique in the event that: 

11 (a) the fraction of current members of the clique that are also members

12 of the one of the clusters that has member unit u as the base unit is equal to or greater than a 

13 first threshold value; and 

14 (b) the fraction of clusters having current clique members as base units 

15 that also include member unit u is equal to or greater than a second threshold value.

1 45. The method of claim 44, wherein the first threshold value and the 

2 second threshold value are each equal to 100%. 

1 46. The method of claim 44, wherein the first threshold value and the 

2 second threshold value are each equal to about 70%. 
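
The clique-forming acts of claims 44 to 46 can be sketched as below. The mapping from each base unit to the set of units in its cluster, the sorted iteration order (used only to make the example deterministic), and the treatment of a unit without its own cluster as a cluster containing only itself are assumptions of the example.

```python
from typing import Dict, Set


def form_clique(clusters: Dict[str, Set[str]], start_base: str,
                first_threshold: float = 1.0, second_threshold: float = 1.0) -> Set[str]:
    """Grow a clique from the starting cluster's base unit using tests (a) and (b)."""
    clique = [start_base]
    for u in sorted(clusters[start_base]):
        if u == start_base:
            continue
        own_cluster = clusters.get(u, {u})
        # (a) fraction of current clique members found in the cluster based at u
        frac_a = sum(1 for m in clique if m in own_cluster) / len(clique)
        # (b) fraction of clusters based at current clique members that include u
        frac_b = sum(1 for m in clique if u in clusters.get(m, {m})) / len(clique)
        if frac_a >= first_threshold and frac_b >= second_threshold:
            clique.append(u)
    return set(clique)


clusters = {
    "a": {"a", "b", "c", "d"},
    "b": {"a", "b", "c"},
    "c": {"a", "b", "c"},
    "d": {"d"},
}
strict_clique = form_clique(clusters, "a")  # both thresholds at 100%
```

With both thresholds at 100%, unit "d" is left out because its own cluster contains none of the current clique members. Lowering the thresholds, for example to 0.7, admits units that agree with most but not all current members.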